The rejection of claims 59, 60, 64, and 65 under 35 U.S.C. § 102(b) as being 
anticipated by Takase et al., "Genes Encoding Two Lipoproteins in the leuS-dacA Region of 
the Escherichia coli Chromosome," J. Bac. 169:5692-5699 (1987) ("Takase") is respectfully 
traversed. Takase relates to the coding of two lipoproteins by two genes, rlpA and rlpB, 
located in the leuS-dacA region on the Escherichia coli chromosome. The rlpA gene encodes 
a lipoprotein having a molecular weight of 36K. Figure 6 of the reference details the 
sequence of the 36K lipoprotein gene rlpA and its 5'- and 3'-flanking regions and the amino 
acid sequences deduced from the nucleotide sequence. The position of the PTO is that this 
sequence matches the sequence encoding the claimed δ subunit. Applicant 
respectfully disagrees. The sequence disclosed in Figure 6 of Takase, a sequence of 1408 
base pairs, is not the holA sequence of the present invention. Rather, it is the rlpA gene. 
Takase also discloses a rlpB gene of the E. coli chromosome. At the end of the sequence of 
the rlpB gene shown in Figure 7, the last 230 base pairs, which were not discussed, constitute 
a sequence that encodes the first 20-25% of the holA gene sequence. Takase did not 
recognize this to be an open reading frame of a putative unknown gene, nor did the reference 
disclose the complete sequence of the holA gene (see diagram attached as Exhibit A showing 
the overlap between the disclosed rlpB gene of Takase and the holA gene encoding the 
claimed δ subunit). Thus, the sequences disclosed in Takase are distinct from the holA gene. 
There is no sequence in Takase relevant to the claimed invention other than the first 20-25% 
of the holA gene sequence. Accordingly, the δ protein subunit of polymerase III holoenzyme 
and the gene encoding the δ protein subunit of the polymerase III holoenzyme of the present 
invention are not disclosed by Takase. Moreover, Takase does not disclose the claimed 
expression system or host cell. 

Further, the burden is on the PTO to establish that the sequences disclosed in 
Takase anticipate the present invention. As noted above and in view of the comments below, 
the PTO has failed to meet this burden and, in fact, cannot meet this burden. The two 
sequences disclosed in Takase (i.e., Figures 6 and 7) are not the complete sequence of the 
holA gene and, therefore, Takase does not disclose the δ protein subunit of the polymerase III 
holoenzyme or the gene encoding the δ protein subunit of the polymerase III holoenzyme. 

In addition, the MPSearch result attached to the August 27, 1999, office action 
supports applicant's argument that Takase fails to disclose the nucleotide or protein 
sequences for the entire δ subunit of polymerase III holoenzyme. In particular, the PTO relies 
upon the MPSearch result, identified as Accession No. M94267, to show that the nucleotide 
sequence disclosed in Takase is identical to the claimed nucleotide sequence of the present 
application. However, the sequence of the MPSearch result relied on by the PTO (i.e., 
Accession No. M94267) is not the same sequence as either of those disclosed in Takase. In 
particular, the sequence of the MPSearch result correlates to Reference 2 of the M94267 
entry, i.e., Carter et al., "Molecular Cloning, Sequencing, and Overexpression of the 
Structural Gene Encoding the Delta Subunit of Escherichia coli DNA Polymerase III 
Holoenzyme," J. Bacteriology 174:7013-7025 (1992) ("Carter"), which was published on 
October 29, 1992 (i.e., after the effective filing date (January 24, 1992) of the present 
application). This can be confirmed by review of the "reference" categories in the 
MPSearch entry. For example, for Reference 2 (i.e., Carter), the reference category states 
"REFERENCE 2 (bases 1 to 1147)." The length of the sequence of Accession No. M94267 
is 1147 base pairs and, thus, Carter discloses the sequence of Accession No. M94267 (see 
Figure 5 in the copy of Carter attached hereto as Exhibit B). In contrast, for Reference 1 (i.e., 
Takase), the reference category states "REFERENCE 1 (sites)," i.e., Takase does not 
disclose the sequence of Accession No. M94267 and is cited only for certain sites in the 
sequence. Moreover, the MPSearch result relied on by the PTO indicates that there is a 
conflict in the sequence listed for Accession No. M94267 and the sequence disclosed in 
Takase (see "conflict" category of MPSearch Entry). 

Further, as illustrated in the attached e-mail from GSDB, the MPSearch result 
(i.e., Accession No. M94267) relied on by the PTO was first released to the public on 
November 3, 1992 (Exhibit C), i.e., after the effective filing date (January 24, 1992) of the 
present application. The "two sequences [of Takase] are distinct from the sequence in 
accession number M94267" (see Exhibit C). The earliest release date associated with 
Accession No. M94267 is September 5, 1989, which, as noted in Exhibit C, relates to the 
Takase reference. As noted above and in Exhibit C, Takase does not disclose the sequence of 
Accession No. M94267, but discloses two different sequences (rlpA and rlpB). The sequence 
in Accession No. M94267 was not published as part of Takase. Moreover, the sequence in 
Accession No. M94267 was not even deposited into GSDB/GenBank until May 13, 1992, 
i.e., after the effective filing date of the present application. As a result, the MPSearch result 
itself (i.e., Accession No. M94267) cannot be prior art with respect to the claimed invention 
and any rejection based on this MPSearch result should be withdrawn. 

The burden is on the PTO to establish that the sequence disclosed in the 
MPSearch result relied upon is entitled to the same date for prior art purposes as Takase. The 
PTO has failed to meet this burden and, in fact, cannot meet this burden in view of Exhibit C 
and the previous remarks. Since Takase does not disclose the entire δ protein subunit nor the 
entire sequence encoding the δ protein subunit and the MPSearch result is not prior art, there 
is no basis for an anticipation rejection and the rejection based on Takase must be withdrawn. 

The rejection of claims 54, 55, 57-60, 64, and 65 under 35 U.S.C. § 103(a) for 
obviousness over Takase is respectfully traversed. As stated above, Takase does not disclose 
the entire specified isolated δ protein subunit of polymerase III holoenzyme, nor the entire 
gene encoding that protein. In particular, Takase discloses only a short portion of the gene 
encoding the δ protein subunit in Figure 7. In addition, Takase provides no motivation to 
determine the sequence of the remainder of the gene. Specifically, Takase failed to identify 
the open reading frame of the gene for the δ protein subunit of polymerase III holoenzyme 
and, therefore, provides no motivation or suggestion to determine the remainder of the gene 
encoding the δ protein subunit. Further, the focus of Takase is on two genes, rlpA and rlpB, 
which are different from the gene encoding the δ protein subunit of polymerase III 
holoenzyme. As a result, Takase provides no motivation with respect to determining the 
sequence of the gene encoding the δ protein subunit of polymerase III holoenzyme. 

Moreover, the MPSearch result relied upon by the PTO is not prior art with 
respect to the claimed invention, as described above. Therefore, the rejection based on 
Takase is improper and should be withdrawn. 

In view of all of the foregoing, applicants submit that this case is in 
condition for allowance and such allowance is earnestly solicited. 

Using an oligonucleotide hybridization probe, we have mapped the structural gene for the δ subunit of 
Escherichia coli DNA polymerase III holoenzyme to 14.6 centisomes of the chromosome. This gene, designated 
holA, was cloned and sequenced. The sequence of holA matches precisely four amino acid sequences obtained 
for the amino terminus of δ and three internal tryptic peptides. A δ-overproducing plasmid that directs the 
expression of δ up to 4% of the soluble protein was constructed. Sequence analysis of holA revealed a 1,029-bp 
open reading frame that encodes a protein with a predicted molecular mass of 38,703 Da. holA may reside 
downstream of rlpB in an operon, perhaps representing yet another link between structural genes for the DNA 
polymerase III holoenzyme and proteins involved in membrane biogenesis. These and other features are 
discussed in terms of genetic regulation of δ-subunit synthesis. 



DNA polymerase III (Pol III) holoenzyme (referred to 
here as holoenzyme) is the major replicative complex of 
Escherichia coli. It contains a core DNA Pol III plus seven 
auxiliary proteins that confer upon the holoenzyme special 
properties that distinguish it from simpler polymerases not 
devoted to chromosomal replication (for reviews, see 
references 27, 37, and 47). These properties include extreme 
processivity and a high rate of nucleotide incorporation, 
resistance to physiological levels of salt, a preference for a 
single-stranded DNA-binding protein-coated template, and 
the ability to communicate with components of the 
replisome to effect coordinated replication of leading and lagging 
strands of the replication fork (27, 37, 73, 74, 77). 

Four different forms of DNA Pol III have been purified. 
The core Pol III is composed of the catalytic α subunit 
(dnaE), ε (dnaQ), the 3'-to-5' proofreading subunit, and θ. 
The dimeric Pol III' is formed upon addition of the τ subunit 
(dnaX) (35, 61) to Pol III. Pol III' is distinguished from the 
core by its 10-fold higher processivity (8, 9) and stimulation 
by physiological concentrations of spermidine (8). Addition 
of the γδ complex (γ, δ, δ', χ, and ψ) to Pol III' forms Pol 
III*, which is about fivefold more processive than Pol III' 
and is stimulated by single-stranded DNA-binding protein 
(8). 

Holoenzyme includes all subunits found in Pol III*, plus 
the β subunit. Holoenzyme is the only form of Pol III that 
can efficiently replicate natural DNA template molecules in 
vitro in the presence of other replicative proteins (9, 24, 41, 
43, 74). The holoenzyme is highly processive. It remains 
associated with templates in vitro for 30 to 40 min and may 
be sufficiently processive to replicate the entire E. coli 
chromosome without dissociation (22, 36, 74). To form a 
highly processive complex, holoenzyme must first combine 
with the primed template in an ATP-dependent reaction to 
form an initiation complex (71, 72). This complex is stable 
and is isolable by gel filtration (21, 72). Addition of the four 
required deoxynucleoside triphosphates results in the rapid 
replication of template DNA without dissociation of 
polymerase (9, 23). The ATP dependence of initiation complex 
formation presumably derives from the requirement for ATP 
in γδ-mediated transfer of β to a primed template (23, 47). 
Transfer of β can also be effected by τδ or τδ' complexes in 
ATP-dependent reactions analogous to the γδ reaction (47). 

It has been proposed that holoenzyme functions as an 
asymmetric, dimeric polymerase complex in which two 
functionally distinct polymerase halves are responsible for 
replication of either the leading or the lagging strand (23, 33, 
39; reviewed in references 39 and 40). Functional 
specialization of the two polymerase halves could be used to solve the 
problem at the replication fork that results from continuous 
replication of the leading strand requiring a polymerase to 
remain bound throughout the entire cycle of replication and 
from discontinuous replication of the lagging strand 
requiring a polymerase to synthesize a 1,000- to 2,000-nucleotide 
Okazaki fragment, dissociate, and initiate synthesis of a new 
Okazaki fragment in about 1 s. 

The asymmetric dimer hypothesis stemmed from 
experiments to examine the ATP dependence of holoenzyme 
activity, in which the ATP analog ATPγS was found to 
mediate formation of only one-half the amount of initiation 
complex as ATP and to cause dissociation of one half of the 
initiation complex formed with ATP (23). These results 
reveal a functional asymmetry between two halves of the 
dimer: one half can form initiation complexes by using 
ATPγS, whereas the other is sensitive to this analog. 

Additional support for the asymmetric dimer hypothesis 
derives from examination of the relationship between the τ 
and γ subunits. Both subunits are products of dnaX; γ is 
produced following ribosomal frameshifting to a reading 
frame that contains a termination codon (3, 11, 67). Thus, the 
amino-terminal 47.5 kDa of τ and all but the last amino acid 
residue of γ are identical. The τ subunit, a DNA-dependent 
ATPase (30, 66), contains an additional 23.6-kDa carboxyl 
terminus that has sequence similarity to several proteins that 
interact with nucleic acids (38). That both subunits have 
ATP-binding sites (21, 30, 66) and are found within 
individual holoenzyme complexes (22) raises the possibility that 
these two subunits perform similar roles in different halves 
of holoenzyme. Consistent with this notion are data reported 
by O'Donnell and Studwell (47), who isolated minute 
quantities of δ and δ' and showed that processive DNA synthesis, 
equivalent to that of holoenzyme, can be effected with a 
minimal combination of an αε complex, β, and γδ, τδ, or τδ'. 

A rigorous test of the asymmetric dimer hypothesis 
requires the availability of large quantities of each holoenzyme 
subunit. As a step toward this goal, we have isolated the 
structural gene for δ. In this report, we describe the 
identification of the δ structural gene and its cloning, sequencing, 
analysis, and overproduction. 



MATERIALS AND METHODS 

Abbreviations. Abbreviations used are as follows: SDS, 
sodium dodecyl sulfate; IPTG, 
isopropyl-β-D-thiogalactopyranoside; HPLC, high-pressure liquid chromatography; and 
HIV, human immunodeficiency virus. 

Chemicals. SDS, urea, N,N'-methylene-bisacrylamide, 
acrylamide, and Coomassie brilliant blue R-250 were 
obtained from Bio-Rad. [γ-32P]ATP was purchased from ICN. 
Tris-HCl, bovine serum albumin, polyvinylpyrrolidone, 
dextran sulfate, and Ficoll were obtained from Sigma. 
Low-molecular-weight protein standards were purchased from 
Pharmacia. SeaKem LE agarose was purchased from FMC 
BioProducts. Bacteriophage λ DNA digested with HindIII 
was purchased from Promega and used as a double-stranded 
DNA molecular weight marker. All other chemicals were 
reagent grade. 

Oligonucleotides. Oligonucleotides used in this work were 
synthesized by the University of Colorado Cancer Center 
Macromolecular Resources Core Facility. The two 
oligonucleotides used to reconstruct the 5' end of the structural gene 
for δ were 5'-GATCTAGGAGGTAATAAATAATGATCCG 
CCTGTAC-3' and 5'-AGGCGGATCATTATTTATTACCT 
CCTA-3'. The δ-gene-specific 51-mer probe was 5'-GCGGC 
GTATGnTTACTTGGTAACGATCCTCTGTTATTGCAG 
GAAAGCCAG-3'. 
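
The relationship between the two 5'-end reconstruction oligonucleotides can be checked with a short script. The sketch below is illustrative only: the sequences are typed as read from the text above, the helper name `reverse_complement` is hypothetical, and the idea that the two oligonucleotides were annealed into a duplex is an assumption rather than something stated here. It tests whether the shorter oligonucleotide is the reverse complement of the interior of the longer one and reports what would be left single stranded at each end.

```python
# Sketch: complementarity check for the two 5'-end reconstruction oligonucleotides.
# Sequences are copied from the Oligonucleotides paragraph (an assumption about
# the correct reading of the printed text).
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

top = "GATCTAGGAGGTAATAAATAATGATCCGCCTGTAC"      # 35-mer
bottom = "AGGCGGATCATTATTTATTACCTCCTA"           # 27-mer

paired = reverse_complement(bottom)               # region of `top` that would pair
start = top.find(paired)                          # -1 if not complementary
left_overhang = top[:start]                       # unpaired bases at the 5' end of top
right_overhang = top[start + len(paired):]        # unpaired bases at the 3' end of top

print(start, left_overhang, right_overhang)
```

Under these assumptions the shorter oligonucleotide pairs with the interior of the longer one, leaving four unpaired nucleotides at each end of the longer strand.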

Bacterial strains, plasmids, and media. MGC100, an isolate 
of RS320 [Δ(lacIPOZYA)U169 Δlon araD139 strA supF; gift 
of R. Sclafani, University of Colorado Health Sciences 
Center] that is resistant to a persistent phage that 
contaminates our fermentor (possibly bacteriophage T1), was used 
to isolate holoenzyme. Chromosomal DNA was purified 
from MAF102, a derivative of the wild-type strain MG1655 
(14) into which lexA3 and uvrD (70) had been introduced by 
P1 transduction. HB101 [supE44 hsdS20 (rB- mB-) recA13 
ara-14 proA2 lacY1 galK2 rpsL20 xyl-5 mtl-1] (4) was used as 
the host strain for overexpression of the structural gene for 
δ. XL1 Blue [F' proAB lacIqZΔM15 Tn10 (Tetr)/recA1 endA1 
gyrA96 thi hsdR17 supE44 relA1 lac], purchased from 
Stratagene, was the recipient strain in routine bacterial 
transformation experiments. pBlueScript II SK+ (Stratagene) and 
pUC19 (68, 75) were used as cloning vectors. pRT581 is a 
plasmid used to overproduce the HIV reverse transcriptase 
subunit p51 (62a). pJC1 is a plasmid used to overproduce the 
HIV nucleocapsid (76). 

L broth (42) was used for routine growth of bacterial 
strains. Medium used in the δ-overexpression experiment 
contained 0.9% yeast extract, 0.8% peptone, and 0.9% 
potassium phosphate, pH 7.2. When required, ampicillin and 
tetracycline were used at 50 and 10 μg/ml, respectively. 

Enzymes. Restriction enzymes and T4 DNA ligase 
(Promega), calf intestinal alkaline phosphatase (Boehringer 
Mannheim Biochemicals), and T4 polynucleotide kinase 
(New England Biolabs) were used according to the 
manufacturers' instructions. The β subunit of holoenzyme was 
purified to homogeneity as described elsewhere (48) to a final 
concentration of 1.4 mg/ml. 

Preparation of the δ subunit for amino acid sequence 
analysis. Holoenzyme was purified as described elsewhere 
(46) from 6 kg of MGC100 cells. Three preparations varying 
from 233 to 253 μg/ml and 5.4 x Itf to 7.0 x 10^ U/mg were 
used. 

Protein was concentrated by vacuum dialysis (0°C, 4 h) in 
a collodion bag (Schleicher & Schuell; molecular mass cutoff 
= 25,000 Da). The dialysis buffer was 63 mM Tris-HCl, pH 
8.8, 10% glycerol, and 10 mM dithiothreitol. After removal 
of sample, the collodion bag was incubated in dialysis buffer 
for an additional 30 min to backwash the membrane. Three 
concentrated (8.5-mg/ml) holoenzyme samples (420, 508, 
and 508 μg) were prepared; SDS was added to each to a final 
concentration of 1%. Two samples (420 and 508 μg) were 
incubated for 5 min in a boiling water bath and loaded into 
two wells of a 1.5-mm SDS-7.5 to 17.5% polyacrylamide gel. 
For quantitation, 40 μg of purified β subunit was loaded onto 
each half of the gel. After electrophoresis for 16 h at 7 mA 
according to the method of Laemmli (28), one half of the gel, 
containing only the β-subunit standard, was stained with 
Coomassie brilliant blue R-250. The second half, containing 
holoenzyme and the β-subunit standard, was 
electrophoretically transferred onto a nitrocellulose membrane 
(Schleicher & Schuell; 0.45-μm pore size) (65) for 2 h at 0.5 A in 
transfer buffer (25 mM Tris-HCl, pH 8.2, 192 mM glycine, 
20% methanol) by using a Hoefer Transblot apparatus. After 
transfer, the gel was stained with Coomassie brilliant blue 
R-250 to allow quantitative comparison to the half of the gel 
not transferred. The filter was rinsed in H2O for 2 min, 
stained for 2 min (0.1% amido black 10B-10% acetic acid-
45% methanol), destained for 3 min (10% acetic acid-45% 
methanol), and rinsed in water for 2 min (34). 

The stained gels and the stained filter were scanned on a 
Molecular Dynamics computing densitometer model 300S 
and quantitated by using Molecular Dynamics ImageQuant 
version 3.0. The filter was then frozen while wet in a plastic, 
sealable bag and stored at -70°C until used for amino acid 
sequence analysis. Volumetric integration was used to 
estimate the amount of δ transferred onto the filter. Comparison 
of the β-subunit standard on the transferred and 
nontransferred gels indicated that about 75% of β, or 30 μg, had 
transferred out of the gel. Adsorption to the filter was 
assumed to be quantitative. Analysis of the filter showed that 
the mass of β in holoenzyme was 82% of the mass of β 
standard. Therefore, about 25 μg of β from holoenzyme had 
transferred to the membrane. In the two holoenzyme lanes, 
the mass of δ was calculated to be 70 and 80% of the mass of 
β. We conclude that about 18 and 22 μg (about 400 to 500 
pmol) of δ was transferred to the filter. 
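
The chain of estimates in this paragraph can be followed step by step in code. The sketch below is only an illustration built from the percentages quoted above; the variable names are hypothetical, and the molar mass used for the mass-to-picomole conversion (the predicted 38,703 Da) is an assumption about which value is appropriate. Because the published figures are rounded and rest on the underlying densitometry values, the sketch is not expected to reproduce them exactly.

```python
# Sketch of the transfer-yield arithmetic, using the figures quoted in the text.
beta_loaded_ug = 40.0          # purified beta standard loaded per gel half
transfer_fraction = 0.75       # fraction of the beta standard that left the gel
beta_holo_vs_standard = 0.82   # beta in holoenzyme relative to the beta standard
delta_vs_beta = (0.70, 0.80)   # delta mass relative to beta in the two lanes

# Mass of beta standard transferred out of the gel.
beta_standard_transferred_ug = beta_loaded_ug * transfer_fraction

# Mass of holoenzyme-derived beta on the membrane.
beta_holo_transferred_ug = beta_standard_transferred_ug * beta_holo_vs_standard

# Estimated mass of delta on the membrane for each lane.
delta_transferred_ug = [r * beta_holo_transferred_ug for r in delta_vs_beta]

def ug_to_pmol(mass_ug: float, molar_mass_da: float) -> float:
    """Convert micrograms to picomoles: ug / (g/mol) gives umol; x 1e6 gives pmol."""
    return mass_ug / molar_mass_da * 1e6

DELTA_MOLAR_MASS_DA = 38703.0  # assumption: predicted mass of the holA product

for mass in delta_transferred_ug:
    print(f"{mass:.1f} ug is about {ug_to_pmol(mass, DELTA_MOLAR_MASS_DA):.0f} pmol")
```

The first two steps give 30 μg and roughly 25 μg of β, in line with the text; the δ estimates depend on the lane ratios and on the molar mass chosen for the conversion.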

Amino acid sequence determination. Approximately 40 μg 
of the δ subunit was cut out of two nitrocellulose membranes 
and digested in situ with trypsin. Tryptic fragments of δ were 
separated by using a narrow-bore Brownlee Aquapore 
Bu-300 column reversed-phase HPLC system. Amino-terminal 
sequence analysis of the peptides was performed on an 
Applied Biosystems 477A protein sequencer (1, 15). 

Amino-terminal sequence analysis of intact δ subunit was 
performed by the University of Colorado Cancer Center 
Protein Microsequencing Core Facility. Approximately 20 
μg of the δ subunit (from the third sample of concentrated 
holoenzyme) was cut out of the ProBlott membrane (Applied 
Biosystems) and sequenced on an Applied Biosystems 477A 
protein sequencer equipped with an on-line PTH C-18 HPLC 
cartridge. Electrophoretic transfer and determination of 
transfer efficiency and mass of δ transferred were as 
described for the nitrocellulose membrane. After transfer, the 
membrane was rinsed in H2O for 2 min, stained for 2 min 
(0.1% Coomassie brilliant blue R-250-50% methanol), 
destained for 3 min (10% acetic acid-50% methanol), and air 
dried (34). 

Plasmid purification. Large-scale plasmid DNA isolation 
was performed by the alkaline-SDS lysis method (2). 
Plasmid DNA was further purified by processing through two 
CsCl-ethidium bromide equilibrium density gradients in a 
Sorvall T-1270 rotor for 20 h at 187,800 × g or 40 h at 
131,900 × g. 

Isolation of high-molecular-weight genomic DNA. Cells 
were grown at 37°C to mid-exponential phase in L broth, 
pelleted by centrifugation, and resuspended in a solution of 
Tris-sucrose (10% sucrose, 50 mM Tris, pH 8.0). The 
suspension of cells was brought to 70 mM EDTA by addition 
of 0.25 M EDTA, pH 8.0, and incubated on ice for 5 min. 
Egg white lysozyme, RNase, and SDS were added to final 
concentrations of 0.5 mg/ml, 0.5 μg/ml, and 1%, 
respectively. A 2-h incubation on ice was followed by addition of a 
fresh solution of proteinase K to a final concentration of 100 
μg/ml. The lysate was incubated at 37°C overnight, 
phenol-chloroform extracted five times, and processed through two 
55% (wt/vol) CsCl equilibrium density gradients in a Sorvall 
T-1270 rotor for 20 h at 187,800 × g. Ten-microliter aliquots 
of 1-ml fractions of the gradient were subjected to agarose 
gel electrophoresis and ethidium bromide staining to identify 
fractions containing DNA. Typically, DNA was found in two 
fractions. 
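
Bringing a suspension to a target concentration by adding a concentrated stock, as in the EDTA step above, follows from conservation of solute. The sketch below is a generic worked example, not part of the published protocol: the function name and the 10-ml starting volume are hypothetical, and it assumes the cell suspension contains no EDTA before the addition.

```python
# Sketch: volume of stock solution needed to reach a target final concentration.
# Solving stock * v / (initial_volume + v) = final for v gives
#   v = initial_volume * final / (stock - final)

def stock_volume_to_add(initial_volume_ml: float, stock_mM: float, final_mM: float) -> float:
    """Volume (ml) of stock to add to a solute-free sample to reach final_mM."""
    if not 0 <= final_mM < stock_mM:
        raise ValueError("final concentration must be below the stock concentration")
    return initial_volume_ml * final_mM / (stock_mM - final_mM)

# Hypothetical example: 10 ml of cell suspension, 0.25 M (250 mM) EDTA stock,
# target of 70 mM EDTA.
v = stock_volume_to_add(10.0, 250.0, 70.0)
final_check_mM = 250.0 * v / (10.0 + v)
print(f"add {v:.2f} ml of stock; final concentration {final_check_mM:.1f} mM")
```

Later additions (lysozyme, RNase, SDS, proteinase K) dilute the mixture further, so the sketch describes only the concentration immediately after the EDTA addition.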

Agarose gel electrophoresis. Horizontal agarose gels 
containing between 0.6 and 1.5% agarose in TBE (89 mM Tris 
base, 2.75 mM EDTA, 89 mM boric acid) were run in a 
Hoefer submarine gel apparatus. Gels used to separate 
chromosomal DNA restriction fragments (2 to 4 μg of DNA 
per lane) were run at 4°C for 16 to 24 h at 0.5 to 2 V/cm. Gels 
used for analysis of plasmid DNA were run at room 
temperature for 1 to 5 h at 2 to 10 V/cm. After electrophoresis, gels 
were stained for 10 min in a 0.8-μg/ml solution of ethidium 
bromide dissolved in TBE. Gels were illuminated with 
254-nm UV light from a Fotodyne transilluminator and 
photographed with Polaroid type 667 film. 

Gels used for purification of restriction fragments were 
made to contain between 0.6 and 1.0% agarose in TAE (40 
mM Tris base, 20 mM acetic acid, 10 mM EDTA) and were 
run at room temperature at not more than 5 V/cm. Not more 
than 2 μg of DNA was loaded onto the gel in one wide well. 
The DNA fragment of interest was localized (without UV 
irradiation of the gel to avoid introducing potentially 
mutagenic, UV-induced lesions) by the following method. After 
electrophoresis, the gel was cut lengthwise to remove a thin 
gel slice containing molecular weight markers and a 
representative sample of the separated DNA. The gel slice was 
stained with ethidium bromide and UV irradiated to 
visualize the DNA fragments. A slice was made immediately 
below the DNA fragment of interest to mark its location. The 
remainder of the gel that had not been stained or irradiated 
was positioned next to the stained gel, and the DNA 
fragment of interest was localized. The fragment was excised 
from the unstained gel and purified by using the GeneClean 
II DNA purification kit from Bio 101. Chromosomal 
restriction fragments were purified by using an Elutrap apparatus 
(Schleicher & Schuell) according to the manufacturer's 
instructions. 

Southern blotting. An agarose gel containing chromosomal 
DNA restriction fragments separated by electrophoresis was 
incubated for 10 to 15 min in 0.25 M HCl, rinsed briefly with 
deionized H2O, and incubated for 20 min in denaturing 
solution (0.5 M NaOH, 1.5 M NaCl). A second, 30-min 
incubation in fresh denaturing solution was followed by a 
30-min incubation in 10× SSC (1.5 M NaCl, 0.15 M Na 
citrate · 2H2O). DNA was transferred from the agarose gel 
to a GeneScreen nylon membrane (New England Nuclear) 
for 12 to 18 h by using a transfer buffer of 10× SSC and a 
conventional Southern blot assembly (59). After transfer, the 
membrane was incubated for 1 min in 0.4 M NaOH and 
neutralized for 1 min in 25 mM sodium phosphate, pH 6.0. 
The damp membrane was subjected to 1.6 kJ of UV light per 
m2 from a germicidal lamp to cross-link the DNA to the 
membrane. 

Preparation of 5'-end-labeled oligonucleotide. Fifty 
picomoles of the 51-mer oligonucleotide was 5' end labeled with 
32P by incubation with 160 μCi of [γ-32P]ATP (6,000 
Ci/mmol, 160 μCi/μl) and 10 U of T4 polynucleotide kinase in 
50 μl of kinase buffer (50 mM Tris-HCl, pH 7.5, 10 mM 
MgCl2, 10 mM dithiothreitol) for 30 min at 37°C. 
Oligonucleotide was separated from nucleotide by using a 1-ml G-25 
Sephadex (Pharmacia) gel filtration spin column equilibrated 
and developed with 20 mM Tris-HCl, pH 8.0, 1 mM EDTA, 
and 50 mM NaCl. Purified oligonucleotide was frozen at 
-20°C until used in hybridization experiments. 
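
The molar amount of ATP in a labeling reaction follows from its radioactivity and specific activity. The sketch below is a hedged illustration: it assumes the specific activity quoted above is 6,000 Ci/mmol (the units are partly illegible in the source), and the function and variable names are hypothetical. It compares the amount of labeled ATP with the 50 pmol of oligonucleotide 5' ends in the reaction.

```python
# Sketch: picomoles of ATP in a kinase labeling reaction.
# Assumption: specific activity is 6,000 Ci/mmol.

def atp_pmol(activity_uCi: float, specific_activity_Ci_per_mmol: float) -> float:
    """Convert radioactivity (uCi) to picomoles using specific activity (Ci/mmol)."""
    mmol = (activity_uCi * 1e-6) / specific_activity_Ci_per_mmol
    return mmol * 1e9  # 1 mmol = 1e9 pmol

atp = atp_pmol(160.0, 6000.0)      # ATP added to the reaction
oligo_pmol = 50.0                  # 5' ends available for labeling

# Upper bound on the fraction of 5' ends that could carry label,
# assuming complete transfer of every gamma-phosphate.
max_fraction_labeled = min(1.0, atp / oligo_pmol)

print(f"{atp:.1f} pmol ATP; at most {max_fraction_labeled:.0%} of 5' ends labeled")
```

Under this assumption the ATP, not the oligonucleotide, is the limiting reagent, so only a fraction of the probe molecules can be labeled.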

Identification of chromosomal restriction fragments 
complementary to the 51-mer oligonucleotide. The GeneScreen 
membrane, to which restriction enzyme-digested, 
chromosomal DNA was transferred, was incubated in a sealable 
plastic bag containing 30 ml of prehybridization solution 
(0.2% bovine serum albumin, 0.2% polyvinylpyrrolidone 
[molecular weight = 40,000], 0.2% Ficoll 400,000, 0.1% 
sodium PPi, 1.0% SDS, 10% dextran sulfate, 1 M NaCl, 50 
mM Tris-HCl [pH 7.5], and 100 μg of low-molecular-weight, 
denatured salmon testis DNA per ml) for 10 to 20 h in a 65°C 
shaking water bath. Radioactive 51-mer oligonucleotide was 
added to the solution to yield 2 x lO' cpm/ml, and incubation 
was continued for 12 to 18 h. After hybridization, the 
membrane was washed by two 5-min incubations at room 
temperature in washing buffer (0.3 M NaCl, 60 mM Tris-HCl 
[pH 8.0], 2 mM EDTA), followed by two 30-min washes at 
60°C in washing buffer supplemented with SDS to a final 
concentration of 1.0% and two 30-min washes with 0.1× 
washing buffer at room temperature. All washes were 
performed with constant agitation. The membrane was dried, 
wrapped in plastic wrap, and exposed for 4 to 48 h on a 
Molecular Dynamics Phosphorimager cassette containing a 
phosphor screen. The phosphor screen was scanned by 
using a Molecular Dynamics Phosphorimager model 400E, 
and the data were analyzed by using the ImageQuant version 
3 program. Images were printed on a Hewlett-Packard 
LaserJet III modified to allow printing of 256 shades of gray. 

Colony hybridization. The original clone of the δ gene was 
identified by colony hybridization (56). The radiolabeled 
51-mer oligonucleotide was used as the probe. 

DNA sequencing. The structural gene for δ was sequenced 
by Lark Sequencing Technologies, Inc., Houston, Tex., by 
the dideoxy chain termination method of Sanger et al. (57). 
The gene was sequenced in both directions. 

FIG. 1. Preparation of the δ subunit for amino acid sequence analysis. (A) Five hundred micrograms of holoenzyme was concentrated, 
resolved on an SDS-7.5 to 17.5% polyacrylamide gel, and transferred onto a nitrocellulose filter. The filter was stained and scanned by 
densitometry; the resulting computer-generated image is shown. The positions of molecular mass standards are shown in kilodaltons on the 
left; the holoenzyme subunits are labeled on the right. (B) The δ subunit was cut out of the blot shown in panel A and digested directly on 
the filter with trypsin. The resulting peptides were separated by reversed-phase HPLC. Peptides corresponding to the peaks immediately 
below the numeric labels were subjected to amino-terminal sequencing. 



Overexpression of the δ subunit. Cells from a fresh 
overnight liquid culture were diluted 1:100 in 25 ml of fresh broth 
and grown at 37°C, with vigorous shaking, to an A600 of 0.4. 
IPTG was added to 10 ml of each culture to a final 
concentration of 1 mM, and growth was monitored by measurement 
of the A600 at 30-min intervals. After 4 h, 1 ml of cells was 
pelleted by centrifugation for 5 min, resuspended in 100 μl of 
lysis buffer (100 mM Tris-HCl, pH 7.5, 100 mM NaCl, 5% 
SDS, 100 mM β-mercaptoethanol, 15% glycerol, 0.02% 
bromphenol blue) and incubated in a boiling water bath for 
10 min. Following a second, 2-min centrifugation to remove 
cell debris, the cell lysate was incubated in a boiling water 
bath for 5 min, and a portion representing 0.2 A600 unit 
of cells was loaded onto an SDS-7.5 to 17.5% 
polyacrylamide gel. After electrophoresis, the gel was stained with 
Coomassie brilliant blue R-250 and scanned on a Molecular 
Dynamics scanning densitometer, model 300S, to quantitate 
the overproduction of the δ subunit. 

Computer analysis. GenBank (version 67) was searched by 
using the TFASTA program that translated the GenBank 
DNA sequence in all six reading frames and performed a 
similarity search by the Pearson and Lipman method (51). 
Protein translation, inverted repeat analysis, and codon 
usage analysis were performed by using PC/GENE 
(Intelligenetics) running on an IBM personal computer. 
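
The idea of searching a DNA sequence with a peptide query by translating all six reading frames can be illustrated with a short script. The sketch below is a simplified stand-in, not TFASTA: it uses exact matching with "X" treated as a wildcard for an undetermined residue rather than the Pearson and Lipman similarity method, and all function names and the 15-nucleotide test sequence are illustrative assumptions.

```python
import re

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def translate(dna: str) -> str:
    """Translate complete codons; unknown codons become 'X', stops become '*'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))

def six_frames(dna: str) -> dict:
    """Return translations of the three forward and three reverse frames."""
    dna = dna.upper()
    rev = dna.translate(COMPLEMENT)[::-1]
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rev[offset:])
    return frames

def find_peptide(peptide: str, dna: str) -> list:
    """Find exact peptide matches in any frame; 'X' in the query matches any residue."""
    pattern = re.compile(peptide.upper().replace("X", "."))
    hits = []
    for name, protein in six_frames(dna).items():
        for m in pattern.finditer(protein):
            hits.append((name, m.start()))
    return hits

# Illustrative query: a five-residue peptide with one undetermined position.
hits = find_peptide("MIXLY", "ATGATCCGCCTGTAC")
print(hits)
```

A real similarity search also tolerates mismatches and gaps, which matters when the database sequence contains errors or the peptide data are incomplete; exact matching as sketched here would miss such partial matches.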

Nucleotide sequence accession number. The GenBank 
nucleotide sequence accession number of the sequence 
reported in this paper is M94267. 

RESULTS 

To identify and clone the structural gene for δ, we used a 
reverse genetic approach in which terminal and internal 
peptide sequences for δ were obtained and used to design an 
oligonucleotide hybridization probe to isolate the gene. 

Amino acid sequences of δ peptides. Isolation of peptides in 
adequate quantity for sequence analysis required 
preparation of a highly concentrated sample of holoenzyme, so that 
10 to 20 μg of δ protein could be separated from other 
holoenzyme subunits from each lane of an 
SDS-polyacrylamide gel. Roughly 500 μg of holoenzyme was applied to 
each lane of a gel (Fig. 1A) and electrophoretically 
transferred onto a ProBlott membrane or nitrocellulose 
membrane. Densitometry of the filters and gels indicated that 18 
to 22 μg of δ had been transferred from each lane. 

The δ subunit was excised from the ProBlott membrane 
and subjected to amino-terminal sequencing as described in 
Materials and Methods. Figure 2A shows the 21-amino-acid 
sequence that was obtained (sequence 1). Four amino acid 
residues (designated X) within the 21-amino-acid sequence 
were not identifiable. 

To obtain internal amino acid sequences of the δ protein, 
fragments of δ were produced by tryptic digestion of δ. The 
fragments were separated on a microbore HPLC column, 
and three well-resolved peptides (Fig. 1B) were selected for 
amino-terminal sequencing to obtain three internal amino 
acid sequences (Fig. 2A, sequences 2, 3, and 4). The first 
three residues of sequence 2 overlapped the amino-terminal 
sequence of δ by 3 amino acid residues (Fig. 2A, underlined 
residues). 

Data base searches using the four δ peptide sequences.
Sequence identity searches were performed between each of
the four δ peptide sequences and DNA sequences in GenBank
(version 67) translated in all six reading frames. A
partial match to sequence 1 and a complete match to
sequence 2 were found to amino acids coded by 110 bp of
DNA sequence reported downstream of the rlpB gene
sequence (62). No matches to sequence 3 or 4 were found. The
significance of the matches was unclear since they occurred
in two different reading frames and contained mismatches
and omissions relative to sequence 1.
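The search described above, peptide sequences compared against DNA translated in all six reading frames, can be illustrated with a minimal Python sketch. This is not the software used for the GenBank search, which the text does not name. The helper names (`translate`, `six_frames`, `find_peptide`) are hypothetical, the standard genetic code is assumed, and "X" in a peptide is treated as a wildcard for an undetermined residue. The demonstration DNA is the coding strand of the synthetic oligonucleotide quoted in the legend to Fig. 7.

```python
import re

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate complete codons of `dna`; unknown codons become '?'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "?") for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna):
    """Return translations of the three forward and three reverse frames."""
    dna = dna.upper()
    reverse = dna.translate(COMPLEMENT)[::-1]
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(reverse[offset:])
    return frames


def find_peptide(dna, peptide):
    """List (frame, protein position) hits; 'X' matches any residue."""
    pattern = re.compile(peptide.upper().replace("X", "."))
    hits = []
    for name, protein in six_frames(dna).items():
        hits.extend((name, m.start()) for m in pattern.finditer(protein))
    return hits


# Coding strand of the synthetic oligonucleotide from the Fig. 7 legend.
OLIGO = "GATCTAGGAGGTAATAAATAATGATCCGCCTGTAC"
hits = find_peptide(OLIGO, "MIXLY")  # first five residues of sequence 1
print(hits)
```

Because each frame is searched separately, a peptide whose matches fall in two different frames, as observed for sequences 1 and 2, shows up as hits carrying different frame labels.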

The sequence that contained the imperfect but highly
similar match was downstream of the rlpB gene (62) and thus
might not have been subjected to the same scrutiny as rlpB
sequences. Although the mismatches were likely due to
errors in the DNA sequence, we decided to isolate the gene
from a wild-type E. coli K-12 strain and examine whether the
DNA sequence exactly matched all of our δ protein sequence
information. We selected a 51-nucleotide stretch (see
Materials and Methods) that, when translated, corresponded
perfectly to the first 17 amino acids of sequence 2, thus
minimizing the likelihood that the oligonucleotide contained
errors relative to the true sequence of the chromosome. The
oligonucleotide is specific to DNA near the 5' end of the
predicted gene encoding δ.

[Fig. 2A, peptide sequences]
Sequence 1: MIXLYPEQLXAQLNXGLXAAY
Sequence 2: AAYLLLGNDPLLLQESQDAV
Sequence 3: LSLLWPDG
Sequence 4: VEQAVNDAAHFTPF
[Fig. 2B: alignment of sequences 1 and 2 with reading frames
RF1 and RF2 of the DNA sequence downstream of rlpB]

FIG. 2. Peptides of the δ subunit. (A) Sequence 1 is the
amino-terminal sequence of δ. Sequences 2, 3, and 4 were obtained
from purified tryptic fragments of δ. The first three residues of
sequence 2 and the last three residues of sequence 1 are identical
(underlined). "X" indicates undetermined residues. (B) DNA
sequence and base numbering are as reported elsewhere (62). The
two adjacent termination codons of rlpB are underlined. RF1 and
RF2 indicate protein sequences predicted from two reading frames
of the DNA. Sequences 1 and 2, as presented in panel A, are
indicated by "#1" and "#2". Matches between sequences 1 and 2
and amino acids predicted from the DNA sequence are indicated by
colons.

FIG. 3. Restriction map of the region of the chromosome
containing the δ-subunit structural gene. (A) MAF102 chromosomal
DNA was blotted onto GeneScreen nylon membrane and hybridized
with the 51-mer oligonucleotide specific for the δ structural gene.
Restriction enzyme abbreviations: B, BamHI; G, BglII; H, HindIII;
P, PstI; R, EcoRI; V, EcoRV. Size standards expressed as kilobases
are indicated to the left. (B) A restriction map of the region of the
chromosome containing the structural gene for δ was constructed on
the basis of the restriction digestion patterns observed in the
Southern blot. The 3.6-kb BglII fragment that was cloned into
pBlueScript II SK+ to generate pJRC102 is enlarged to show the
structural gene for δ and the gene rlpB.

Physical mapping of the structural gene for δ. To define a
restriction map of the region of the chromosome containing
the structural gene for δ, MAF102 chromosomal DNA was
digested with all single- and double-enzyme combinations of
EcoRI, EcoRV, BamHI, BglII, HindIII, and PstI. The
digested DNA samples were transferred to a GeneScreen
nylon membrane and hybridized with the δ-gene-specific,
radiolabeled, 51-mer oligonucleotide. Autoradiography
revealed the sizes of restriction fragments complementary to
the oligonucleotide (Fig. 3A). From the blot (Fig. 3A), a
restriction map (Fig. 3B) of the region of the chromosome
containing the structural gene for δ was constructed and
used to identify a restriction fragment that contained the
entire structural gene for δ. The oligonucleotide probe
hybridized to a 3.6-kb BglII fragment (Fig. 3B), indicating
that this fragment contained at least the 5' end of the gene.
Digestion with EcoRV and BglII produced a 1.6-kb restriction
fragment (Fig. 3B) complementary to the oligonucleotide
used as a probe. These restriction fragment sizes plus
others were consistent with the restriction map reported by
Kohara et al. (25) (discussed below). The gene rlpB, which is



(613) 992-8247 Order # 01237334DP0O853032 Thu Nov 9 04:15:23 2000 Page 1 of 8 



7018 



CARTER ET AJ^. 



J. Bacteriol. 



immediately upstream of the DNA complementary to the
oligonucleotide, contains an EcoRV site 239 bp from the
putative initiation codon of the δ structural gene (62).
Therefore, we concluded that ca. 1,350 bp of the EcoRV-BglII
fragment was available to encode δ. The molecular mass of
δ is estimated from SDS-polyacrylamide gel electrophoresis
to be about 34,000 Da. Thus, δ contains about 310 amino acid
residues. An open reading frame containing 310 codons, or
about 1 kb, would be required to encode δ. We concluded
that the entire gene exists within the first 1,350 bp
downstream of rlpB and is contained on the 1.6-kb EcoRV-BglII
fragment and on the 3.6-kb BglII fragment.
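The size argument above can be checked with a few lines of arithmetic. The sketch below assumes an average residue mass of 110 Da, a common rule of thumb that is not stated in the text, and takes the fragment size as 1,600 bp from the rounded "1.6-kb" figure. The function names are hypothetical.

```python
# Assumption: average mass of an amino acid residue, a rule of thumb
# that is not taken from the paper.
AVG_RESIDUE_MASS_DA = 110.0


def estimate_residues(mass_da, avg_residue_mass=AVG_RESIDUE_MASS_DA):
    """Approximate number of residues for a protein of the given mass."""
    return round(mass_da / avg_residue_mass)


def coding_length_bp(n_codons):
    """Length in base pairs of an open reading frame with n_codons codons."""
    return 3 * n_codons


residues = estimate_residues(34_000)   # from the SDS-PAGE estimate
orf_bp = coding_length_bp(310)         # the "about 310 codons" in the text
available_bp = 1_600 - 239             # EcoRV-BglII fragment minus rlpB part

print(residues, orf_bp, available_bp)
```

Under these assumptions the estimate is close to the 310 residues quoted above, and the roughly 0.9 to 1 kb of coding sequence fits within the DNA available downstream of the EcoRV site.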

The restriction map in Fig. 3B was also used to determine
the chromosomal map location of the structural gene for δ.
Comparison of the map in Fig. 3B to the complete restriction
map of the E. coli chromosome (25) revealed a single region
corresponding to ca. 15 min, the map position reported for
rlpB (62); this is consistent with the δ structural gene
mapping downstream of rlpB. rlpB is part of the 15-kb cluster
of genes including, in order, leuS (leucyl-tRNA synthetase),
rlpB, mrdA (peptidoglycan synthetase), mrdB, rlpA (rare
lipoprotein), and dacA (D-alanine carboxypeptidase) (19, 20,
62). The map position of the δ structural gene was more
precisely determined by using the most recent E. coli
restriction map, in which the physical and genetic maps of E. coli
are correlated (53, 54). In this map, rlpB is located at 14.6
centisomes, or 682.5 kb. (The centisome is defined as 1% of
the chromosome, or 47,736 bp, and is used in place of
"minute" to avoid problems arising from the noncolinear
relationship between the physical and genetic maps [53, 54].)
rlpB (525 bp) is immediately upstream of the gene encoding
δ and is transcribed counterclockwise (62). Therefore, the
structural gene for δ begins at 682 kb and ends at 681 kb of
the E. coli chromosome.

Cloning of the structural gene for δ. Having tentatively
identified the structural gene for δ, we proceeded to clone it
and define its sequence to determine whether the mismatches
(Fig. 2B) were due to errors in the DNA or protein
sequence. MAF102 chromosomal DNA was digested with
BglII, and DNA in the size range of 3.6 kb was purified. This
DNA was cloned into the BamHI site of pBlueScript II SK+
and transformed into XL1-Blue. Six hundred
ampicillin-resistant colonies were screened by colony hybridization
(56), using the 51-mer oligonucleotide 5' end labeled with 32P
to identify clones of the structural gene for δ. Three positive
colonies were identified. Plasmid DNA from these three
colonies was characterized by SalI restriction enzyme
analysis. In addition to the SalI site of pBlueScript II SK+, two
SalI sites, separated by 286 bp, were predicted to be in the
3.6-kb BglII fragment (62). SalI restriction analysis revealed
the presence of this 286-bp fragment in plasmids obtained
from all three colonies. Moreover, the identity of the SalI
restriction maps of all three plasmids indicated that the same
3.6-kb BglII fragment had been ligated into pBlueScript II
SK+ in all three isolates (data not shown). One of these
three plasmids, pJRC102, was chosen for further study.

DNA sequence analysis of the structural gene for δ. The
structural gene for δ contained in pJRC102 was subjected to
dideoxy chain termination sequencing (57). The primer used
for the first sequencing reaction was a 17-mer complementary
to DNA 89 bp upstream of the predicted initiation codon
of the δ structural gene (62), i.e., within the rlpB gene.
Choosing to prime within the rigorously sequenced rlpB gene
provided confidence that the DNA sequence used to design
the first primer was error free. The DNA sequence obtained
with the first primer was used to design primers for additional
sequencing reactions. This strategy (Fig. 4) was used
repeatedly to gain double-strand sequence data for the entire
open reading frame (Fig. 5).

FIG. 4. Strategy for sequencing the structural gene for δ. The
structural gene is depicted as the top horizontal line and is divided
into increments of 100 bp. Arrows indicate the extent and direction
of sequencing performed by each of 11 primers (P1 to P11). All of the
gene was sequenced in both directions.

Compared to our sequence, the previously reported
230-bp DNA sequence downstream of rlpB (62) is missing 3
bases and contains 1 extra base and 6 mismatched bases.
Our limited sequence data of the 3' end of rlpB are in
complete agreement with the sequence previously reported
(62). The sequence presented in Fig. 5 contains one
1,029-base open reading frame, or 343 codons, and is predicted to
encode a protein of 38,703 Da. This is in reasonable
agreement with the size of δ, 34,000 Da, estimated by
SDS-polyacrylamide gel electrophoresis. The identified open
reading frame encodes all four partial amino acid sequences
reported in Fig. 2.

The open reading frame contains an AUG initiation codon
that overlaps the two adjacent termination codons of rlpB
and that follows a potential weak ribosome-binding site (58)
(Fig. 5). This site is 6 bases upstream of the initiation codon
and is predicted to pair with the 16S rRNA through two G-C
base pairs and one A-U base pair, thus satisfying the spacing
and minimal base pairing requirements of natural
ribosome-binding sites (60). No consensus promoter was identified
upstream of the gene encoding δ. These three genetic
characteristics, overlap of initiation and termination codons, a
weak ribosome-binding site, and absence of a promoter,
suggest that the primary mechanism of expression of δ is
through translational reinitiation following translation of
rlpB (13). Thus, rlpB and the gene encoding δ may be part of
the same operon. The possibility that a gene downstream of
the gene encoding δ is part of this putative operon exists. An
AUG codon exists only 1 bp after the termination codon of
the gene for δ, and no termination codon exists in the 18
codons downstream of this AUG.

A striking feature of the DNA sequence is an inverted
repeat (Fig. 6) beginning 24 bases downstream of the
initiation codon of the gene encoding δ, i.e., within the gene.
Transcription through this inverted repeat would produce
mRNA with the potential to form a hairpin structure
(calculated ΔG = -27.8 kcal/mol [ca. -116 kJ/mol] [63]). The
hairpin is followed by a uridine-rich region, including four
uridine residues in a row (Fig. 6). These two features, a
hairpin structure followed by a string of uridine residues, are
the hallmarks of a classical rho-independent transcription
terminator (52). If biologically active, this terminator would
provide a mechanism to attenuate expression of δ.
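Two pieces of bookkeeping in this description are easy to sketch: the unit conversion of the calculated free energy, and the classification of stem base pairs into G-C, A-U, and G-U wobble pairs. The Python below is an illustration only. It does not compute folding energies (the ΔG value comes from the method of reference 63), the function names are hypothetical, and the 16-nucleotide demonstration hairpin is invented rather than taken from the δ mRNA.

```python
KJ_PER_KCAL = 4.184  # thermochemical calorie

# Base pairs allowed in an RNA stem, including the G-U wobble pair.
PAIR_TYPES = {
    ("G", "C"): "GC", ("C", "G"): "GC",
    ("A", "U"): "AU", ("U", "A"): "AU",
    ("G", "U"): "GU", ("U", "G"): "GU",
}


def kcal_to_kj(kcal_per_mol):
    """Convert an energy from kcal/mol to kJ/mol."""
    return kcal_per_mol * KJ_PER_KCAL


def classify_stem(rna, stem_len):
    """Count pair types in a perfect hairpin stem of length stem_len.

    Base i from the 5' end is paired with base i from the 3' end.
    Returns None if any of the pairs cannot form (no bulges allowed).
    """
    counts = {"GC": 0, "AU": 0, "GU": 0}
    for i in range(stem_len):
        kind = PAIR_TYPES.get((rna[i], rna[-1 - i]))
        if kind is None:
            return None
        counts[kind] += 1
    return counts


print(round(kcal_to_kj(-27.8), 1))

# Hypothetical hairpin: 6-bp stem around a 4-nucleotide loop.
toy_hairpin = "GCGGAU" + "UUCG" + "AUUCGC"
print(classify_stem(toy_hairpin, 6))
```

The conversion reproduces the ca. -116 kJ/mol quoted for the hairpin.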

[Fig. 5: DNA sequence of the structural gene for δ with the
deduced amino acid sequence; GenBank accession number M94267]

FIG. 5. DNA sequence of the structural gene for the δ subunit.
The arrow indicates the 3' end of rlpB and the 5' end of the
structural gene for δ. The termination codon(s) of rlpB begins at
nucleotide 26, and the δ gene initiation codon begins at nucleotide
55. Double underlining indicates a potential Shine-Dalgarno site.
Amino acids are represented in single-letter code above the first
base of each codon. The dash indicates the translational
termination codon for the gene encoding δ.

FIG. 6. A potential transcription terminator in the δ mRNA
transcript. The hairpin structure, which has a calculated ΔG of -27.8
kcal/mol (ca. -116 kJ/mol), extends from bases 81 to 112 of the
reported DNA sequence (Fig. 5). The stem contains 11 GC base pairs,
1 AU base pair, and 1 GU base pair and is followed on the 3' end by
a region rich in uridine residues, including a run of four uridines in a
row. Numbers correspond to the base numbering of Fig. 5.

We analyzed the codon usage of the gene encoding δ
(Table 1) to determine the percentage of infrequently used
codons in the gene. The set of eight rare codons (AUA,
UCG, CCU, CCC, ACG, CAA, AAU, and AGG) was
compiled from a survey of 25 nonregulatory genes in E. coli
(26); these codons occur at a combined average of 3.5% in
the reading frames of these genes. The subset of 10 ribosome
protein genes, which are very highly expressed, has an
average of 1.7% rare codons. This can be compared to the
presumably random occurrence of rare codons in noncoding
frames, an average of 11.7% (26). Konigsberg and Godson
have hypothesized that rare codon usage is partially
responsible for the low expression of the replication gene dnaG
(26). The gene encoding δ contains 8.7% rare codons (Table
2). Comparison of this value to values for six other E. coli
genes (Table 2) indicates a strong bias of dnaG, rlpB, and the
structural gene for δ toward use of rare codons, which could
contribute to modulation of expression of these three genes.
Rare codons occur with decreasing frequency in the four
holoenzyme-subunit genes dnaQ, dnaX, dnaE, and dnaN.
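The rare-codon percentages discussed here can be computed frame by frame with a short routine. The sketch below is an illustration, not the authors' program. The function name is hypothetical and the 12-nucleotide test sequence is invented. The rare-codon counts for the gene encoding δ are read from Table 1, and the total of 343 codons is the size of the open reading frame.

```python
# The eight rare codons listed in the text (reference 26).
RARE_CODONS = {"AUA", "UCG", "CCU", "CCC", "ACG", "CAA", "AAU", "AGG"}


def rare_codon_percent(seq, frame=0):
    """Percentage of rare codons in one reading frame (frame = 0, 1, or 2)."""
    rna = seq.upper().replace("T", "U")
    codons = [rna[i:i + 3] for i in range(frame, len(rna) - 2, 3)]
    if not codons:
        return 0.0
    rare = sum(codon in RARE_CODONS for codon in codons)
    return 100.0 * rare / len(codons)


# Invented sequence: frame 0 reads AUA CAA GGC UCG (three rare codons).
toy = "AUACAAGGCUCG"
for frame in range(3):
    print(frame + 1, round(rare_codon_percent(toy, frame), 1))

# Rare-codon counts for the gene encoding delta, taken from Table 1.
TABLE1_RARE_COUNTS = {"AUA": 0, "UCG": 2, "CCU": 2, "CCC": 3,
                      "ACG": 4, "CAA": 9, "AAU": 10, "AGG": 0}
TOTAL_CODONS = 343
delta_percent = 100.0 * sum(TABLE1_RARE_COUNTS.values()) / TOTAL_CODONS
print(round(delta_percent, 1))
```

Applied to the Table 1 counts, 30 rare codons out of 343 give the 8.7% reported for the coding frame.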

Amino acid sequence analysis of δ. Translation of the DNA
sequence indicates that δ is a protein of 38,703 Da. The
primary sequence shows a preponderance of leucine
residues, which often occur in stretches of 2 to 5 residues and
which compose 20% of the protein. A search of the sequence
for a consensus leucine zipper motif, which is characterized
by 4 or 5 leucine residues spaced by 7 amino acid residues
(29, 50), revealed none, even if functional amino acid
substitutions (18) are allowed in place of the characteristic
leucine residues. Further inspection of the sequence
revealed no matches to other common protein motifs, e.g.,
zinc finger or helix-turn-helix. Additionally, no significant
primary sequence similarity was observed between δ and the
gene 44 or gene 62 product of phage T4. The function of the
gene 44-gene 62 protein complex in T4 replication is
analogous to that of the γδ complex; both complexes load their
cognate polymerase clamps onto primed templates.
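A leucine zipper search of the kind described can be expressed as a regular expression over the protein sequence. The sketch below assumes the simplest form of the motif, four leucines at every seventh position (L-x6-L-x6-L-x6-L), and does not model the functional substitutions mentioned in the text. The function name and both test strings are hypothetical.

```python
import re

# Four leucines, each separated from the next by six arbitrary residues.
# The lookahead lets overlapping candidate sites be reported as well.
LEUCINE_ZIPPER = re.compile(r"(?=(L.{6}L.{6}L.{6}L))")


def find_leucine_zippers(protein):
    """Return the 0-based start positions of heptad leucine repeats."""
    return [m.start() for m in LEUCINE_ZIPPER.finditer(protein.upper())]


# Invented sequences: one with a heptad repeat, one merely leucine rich.
with_zipper = "MK" + "LAAAAAA" * 3 + "L" + "QQ"
leucine_rich = "LLLLLAAALLLLL"

print(find_leucine_zippers(with_zipper))
print(find_leucine_zippers(leucine_rich))
```

The second string shows why a high leucine content alone, as in δ, does not imply a zipper: the leucines must recur with the seven-residue spacing.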

TABLE 1. Codon usage within the gene encoding δ

Codon    Amino acid    No. of times used    % of total codons
UUU      Phe                   7                   2.0
UUC      Phe                   2                   0.6
Total    Phe                   9                   2.6
UUA      Leu                  14                   4.0
UUG      Leu                  14                   4.0
CUU      Leu                   8                   2.3
CUC      Leu                  10                   2.9
CUA      Leu                   1                   0.2
CUG      Leu                  23                   6.7
Total    Leu                  70                  20.4
AUU      Ile                   4                   1.1
AUC      Ile                   4                   1.1
AUA^a    Ile                   0                   0
Total    Ile                   8                   2.3
AUG      Met                   5                   1.4
GUU      Val                   6                   1.7
GUC      Val                   1                   0.2
GUA      Val                   3                   0.8
GUG      Val                   6                   1.7
Total    Val                  16                   4.7
UCU      Ser                   2                   0.5
UCC      Ser                   1                   0.2
UCA      Ser                   1                   0.2
UCG^a    Ser                   2                   0.5
AGU      Ser                   4                   1.1
AGC      Ser                   4                   1.1
Total    Ser                  14                   4.1
CCU^a    Pro                   2                   0.5
CCC^a    Pro                   3                   0.8
CCA      Pro                   3                   0.8
CCG      Pro                   5                   1.4
Total    Pro                  13                   3.8
ACU      Thr                   3                   0.8
ACA      Thr                   5                   1.4
ACG^a    Thr                   4                   1.1
ACC      Thr                   4                   1.1
Total    Thr                  16                   4.7
GCU      Ala                   7                   2.0
GCC      Ala                   6                   1.7
GCA      Ala                   6                   1.7
GCG      Ala                  18                   5.2
Total    Ala                  37                  10.8
UAU      Tyr                   2                   0.5
UAC      Tyr                   3                   0.8
Total    Tyr                   5                   1.5
CAU      His                   7                   2.0
CAC      His                   2                   0.5
Total    His                   9                   2.6
CAA^a    Gln                   9                   2.6
CAG      Gln                  20                   5.8
Total    Gln                  29                   8.5
AAU^a    Asn                  10                   2.9
AAC      Asn                   7                   2.0
Total    Asn                  17                   5.0
AAA      Lys                   8                   2.3
AAG      Lys                   2                   0.5
Total    Lys                  10                   2.9
GAU      Asp                   7                   2.0
GAC      Asp                   8                   2.3
Total    Asp                  15                   4.4
GAA      Glu                  15                   4.3
GAG      Glu                   5                   1.4
Total    Glu                  20                   5.8
UGU      Cys                   2                   0.5
UGC      Cys                   3                   0.8
Total    Cys                   5                   1.5
UGG      Trp                   7                   2.0
CGU      Arg                   6                   1.7
CGC      Arg                  12                   3.4
CGA      Arg                   2                   0.5
CGG      Arg                   3                   0.8
AGA      Arg                   0                   0
AGG^a    Arg                   0                   0
Total    Arg                  23                   6.7
GGU      Gly                   5                   1.4
GGC      Gly                   4                   1.1
GGA      Gly                   4                   1.1
GGG      Gly                   2                   0.5
Total    Gly                  15                   4.4

^a Rare codons (26).

The γδ complex of holoenzyme is a DNA-dependent
ATPase (49). Analysis of the primary sequence of δ revealed
a region of the protein that resembles the Walker A-consensus
ATP-binding site, (Ala/Gly-1)-(X-2-X-3-X-4-X-5)-Gly-6-Lys-7-(Ser/Thr-8)
(16, 44, 69). The sequence in δ between
amino acid residues 219 and 225 is
Ala-1-(Leu-2-Leu-3-Met-4)-Gly-6-Lys-7-Ser-8. A major deviation of this sequence
from the consensus is the existence of 3 residues instead of
4 between Ala-1 and Gly-6. We know of no example of a
characterized ATP-binding site with this deviation from the
consensus. In addition, 35 of 37 ATP-binding sites (compiled
from references 16, 17, 44, and 69) contain a Gly at position
X-4. The two exceptions, RecA and DnaB, contain a Ser
residue at this position. Finally, whereas ATP-binding sites
form a loop between a β-sheet and an α-helix (44), two
different secondary structure predictions for δ (6, 12) depict
the putative ATP-binding site as being within a region rich in
α-helical structure and devoid of β-sheets (data not shown).
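The spacing argument can be made concrete by writing the Walker A consensus as a regular expression, [AG]-x4-G-K-[ST], and testing the δ segment ALLMGKS (residues 219 to 225) against it. The sketch below is illustrative. The relaxed pattern with a three-residue spacer is introduced only to show where δ deviates, the helper name is hypothetical, and the string GPSGSGKT is an invented example of a site that fits the consensus.

```python
import re

# Walker A consensus as given in the text: (A/G)-X-X-X-X-G-K-(S/T).
WALKER_A = re.compile(r"[AG].{4}GK[ST]")
# Hypothetical relaxed pattern with only three residues in the spacer.
SHORT_SPACER = re.compile(r"[AG].{3}GK[ST]")

DELTA_SITE = "ALLMGKS"   # residues 219-225 of delta, from the text
CANONICAL = "GPSGSGKT"   # invented example matching the consensus


def has_gly_at_x4(site):
    """True if position 4 of an 8-residue consensus-length site is Gly."""
    return len(site) == 8 and site[3] == "G"


print(bool(WALKER_A.search(DELTA_SITE)))      # strict consensus
print(bool(SHORT_SPACER.search(DELTA_SITE)))  # relaxed spacer
print(bool(WALKER_A.search(CANONICAL)), has_gly_at_x4(CANONICAL))
```

The δ segment fails the strict pattern and satisfies only the shortened one, which is the deviation that argues against a functional ATP-binding site.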

Construction of a δ-overproducing plasmid. As a final test
that the structural gene for δ had been isolated, we placed it
in an expression vector to determine whether the gene
directed synthesis of a protein the size of δ. The
overexpression vector, which contains the strong tac promoter, has
been used to overproduce several holoenzyme subunits and
HIV proteins. Our strategy involved cloning all but the first
13 bases of the δ open reading frame into pRT581 (a plasmid
that overproduces the p51 subunit of HIV reverse
transcriptase [62a]), removing the p51 open reading frame, and
reconstructing the 5' end of the δ structural gene with a
synthetic oligonucleotide designed to replace rare codons
with more commonly used synonymic codons (Fig. 7).



TABLE 2. Percentage of rare codons^a in the three reading
frames of seven E. coli genes

                                   % in:
Gene^b               Frame 1^c    Frame 2    Frame 3
Gene encoding δ         8.7         11.4        8.7
rlpB                    9.8         13.4        6.2
dnaE                    4.9         13.4       10.8
dnaN                    4.1         12.5       11.4
dnaX                    6.8         15.6        6.7
dnaQ                    7.0         13.2       11.5
dnaG                   11.3         12.4       12.9

^a Rare codons include AUA (Ile), UCG (Ser), CCU and CCC (Pro), ACG
(Thr), CAA (Gln), AAU (Asn), and AGG (Arg) (26).

^b Sequence data were derived from the following sources: dnaG (26), rlpB
(62), dnaE (64), dnaN (48), dnaX (10), and dnaQ (32).

^c Frame 1 = coding frame.

FIG. 7. Construction of a plasmid that overexpresses δ. pJRC102
contains the 3.6-kb BglII δ-gene-containing fragment of MAF102
chromosomal DNA, cloned into the BamHI site of pBlueScript II
SK+. The plasmid was digested with RsaI. A 1.5-kb fragment,
containing all but the first 10 nucleotides of coding region, was
cloned into pUC19 at the RsaI site which is internal to the unique
KpnI site of pUC19, generating pJRC103. The 1.5-kb fragment of
pJRC103 was cloned into the unique KpnI site of pRT581. The
resulting plasmid, pJRC104, was digested with BglII and KpnI to
remove all but the last 14 codons of the p51 open reading frame.
Cloned into BglII- and KpnI-digested pJRC104 was a synthetic,
duplex oligonucleotide which served to reconstruct the 5' end of
holA and provide a consensus Shine-Dalgarno site, an A/T region
between the initiation codon and the Shine-Dalgarno site, and an
adenosine residue 3 nucleotides 5' to the initiation codon. The
coding strand of the duplex oligonucleotide is
5'-GATCTAGGAGGTAATAAATAATGATCCGCCTGTAC-3'. The product of this
cloning, pJRC105, was the plasmid used to overexpress δ.
Restriction site abbreviations: G, BglII; K, KpnI; Rs, RsaI; S, SalI. The
curved arrow represents the gene encoding δ. Two short parallel
lines at the beginning of the arrow indicate truncation of the gene
caused by digestion with RsaI.
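The design features named in this legend can be read directly off the coding strand of the oligonucleotide. The Python sketch below locates the Shine-Dalgarno core, the spacer, and the initiation codon, and checks the base 3 nucleotides 5' to the AUG. The choice of AGGAGG as the consensus core is an assumption, since the legend does not spell out the consensus, and the function name is hypothetical. The rare-codon set is the one listed in the footnote to Table 2.

```python
# Coding strand of the duplex oligonucleotide, as given in the legend.
OLIGO = "GATCTAGGAGGTAATAAATAATGATCCGCCTGTAC"
SD_CORE = "AGGAGG"  # assumed Shine-Dalgarno consensus core
RARE_CODONS = {"AUA", "UCG", "CCU", "CCC", "ACG", "CAA", "AAU", "AGG"}


def describe_oligo(oligo, sd_core=SD_CORE):
    """Locate the Shine-Dalgarno core, spacer, and start codon."""
    sd_start = oligo.find(sd_core)
    sd_end = sd_start + len(sd_core)
    atg = oligo.find("ATG", sd_end)        # first ATG 3' to the core
    spacer = oligo[sd_end:atg]
    codons = [oligo[i:i + 3] for i in range(atg, len(oligo) - 2, 3)]
    return {
        "sd_start": sd_start,
        "spacer": spacer,
        "spacer_at_fraction": sum(b in "AT" for b in spacer) / len(spacer),
        "minus3_base": oligo[atg - 3],
        "codons": codons,
        "rare": [c for c in codons if c.replace("T", "U") in RARE_CODONS],
    }


features = describe_oligo(OLIGO)
for key, value in features.items():
    print(key, value)
```

On this reading the spacer between the core and the ATG consists entirely of A and T residues, the base at position -3 is an adenosine, and none of the five reconstructed codons belongs to the rare set.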




FIG. 8. Overexpression of the δ subunit. (A) Cells were grown to
an A600 of 0.4, induced with IPTG for 4 h, and lysed as described in
Materials and Methods. Proteins of the cell lysates were resolved by
SDS-polyacrylamide gel electrophoresis and stained with
Coomassie brilliant blue. Lysates from equal masses of cells were
added to each lane of the gel. Lane 1, HB101; lane 2, HB101, IPTG
induced; lane 3, HB101 containing pJC1, the HIV nucleocapsid
protein overproducer; lane 4, HB101 containing pJC1, IPTG
induced; lane 5, JRC107, which contains the δ overexpression plasmid
pJRC105; lane 6, JRC107, IPTG induced. The migration of purified
holoenzyme subunits is indicated on the right; the positions of
molecular mass markers, in kilodaltons, are on the left. (B and C)
Densitometric scans of lanes 5 and 6, respectively. The origin of the
lanes is to the left, and the arrows indicate the locations of
holoenzyme subunits.



Overexpression of δ. The candidate δ expression plasmid
(pJRC105, Fig. 7) was transformed into HB101 to generate
strain JRC107, which was used to test whether the isolated
gene could produce a protein the size of the δ subunit. Cells
were induced by treatment with IPTG. In JRC107,
IPTG-dependent production of a protein that comigrates with the δ
subunit from purified holoenzyme was detected (Fig. 8A,
compare lanes 5 and 6). HB101 was used as a control to
ensure that no protein the size of δ is produced upon IPTG
induction of a strain lacking plasmid (Fig. 8A, lanes 1 and 2).
HB101 containing pJC1, which overproduces HIV
nucleocapsid from the same vector used to construct pJRC105 (76),
was tested as another control; IPTG induction of this strain
causes production of the 6,380-Da nucleocapsid protein and
no protein the size of δ (Fig. 8A, lanes 3 and 4). We conclude
that overproduction of a protein the size of δ occurs only in
IPTG-induced cells carrying a plasmid with the gene
predicted to encode δ. On the basis of densitometric scans of
lane 5 (noninduced JRC107) and lane 6 (IPTG-induced
JRC107) (Fig. 8B and C, respectively), we calculate that the
δ subunit represents approximately 4% of the total soluble
protein of IPTG-induced JRC107.

DISCUSSION 

The primary goal of this work was to isolate and
characterize the structural gene for the δ subunit of DNA
polymerase III holoenzyme. We took an approach to cloning the
structural gene for δ in which we determined the
amino-terminal and several internal sequences for δ and used this
information to devise a strategy to map the gene and to
screen for clones of it.

A preliminary search of sequence data bases revealed
strong but imperfect matches between two of the four
sequences obtained for δ in a 110-bp stretch of DNA
downstream of rlpB. Because of the predicted mismatches
and deletions relative to the amino-terminal sequence of δ
and because two different reading frames were required in
order to align the sequences, we isolated the candidate gene
from a wild-type E. coli K-12 strain to determine whether the
mismatches were due to sequencing errors or whether the
sequence identified in the data base was from an inert
pseudogene.

For a hybridization probe, we synthesized a 51-mer
oligonucleotide based on δ sequences that matched an amino acid
sequence that agreed with both experimentally obtained δ
sequences and sequences identified in the data base. This
sequence hybridized to a unique set of restriction fragments
generated from the E. coli chromosome. A map generated
from these restriction digestions was consistent with the
region of the E. coli chromosome downstream of rlpB, the
originally identified locus for the δ structural gene. We
cloned this region and sequenced it to determine whether the
imperfect alignment was due to errors in the reported DNA
sequence.

The 1,029-bp open reading frame identified from the
sequence aligned perfectly in one reading frame with all four
δ protein sequences and encoded a protein of 38,703 Da,
consistent with the expected size of δ. That the isolated
candidate gene aligned with protein sequences not contained
in the data base and not contained in our probe added
confidence that we had identified the authentic structural
gene for δ. An overproducing plasmid constructed to express
δ using the tac promoter, an optimal ribosome-binding site,
and additional sequence features included to allow high-level
gene expression directed synthesis of a protein that
comigrated with the δ subunit of holoenzyme. Preliminary data
(5a) indicate that the candidate overproducing strain
expresses δ functional activity, based on its ability to
reconstitute holoenzyme activity at least 50-fold over a control
strain. Together, these data indicate that we have isolated
the structural gene for δ, a key step in the study of
holoenzyme activity and of the regulation of holoenzyme gene
expression. The O'Donnell laboratory has independently
cloned the structural gene for δ (7a). As suggested by K.
Marians (Sloan Kettering), the O'Donnell laboratory and our
laboratory have agreed to designate the gene holA.

Sequence analysis revealed that the initiation codon of
holA overlaps the two adjacent termination codons of rlpB.
Although a weak Shine-Dalgarno sequence can be identified
6 bases upstream of the holA initiation codon, the closest
consensus promoter upstream of holA is the rlpB
promoter. Therefore, it is likely that holA is transcribed as part
of a polycistronic message initiating at the rlpB promoter.
The presence of an AUG start codon 1 bp downstream of
holA followed by 18 nontermination codons raises the
possibility that at least one additional gene is transcribed from
the rlpB promoter. The proximity of the holA
Shine-Dalgarno site to the rlpB termination codon(s) (4 bp) and the
overlap of termination and initiation codons suggest that
translational reinitiation, following translation of rlpB
mRNA, could be involved in expression of holA (13). A
genetic organization of this kind is typical of genes
coordinately controlled as part of one operon.

A second mechanism of regulation of δ expression could
occur at the level of transcription termination. We find a
classical rho-independent transcription terminator located
shortly after the initiation codon of holA. Since transcription
and translation are coupled in procaryotes, translation of
holA mRNA, initiated by translational reinitiation, would
inactivate the transcription terminator. However, when
translation does not initiate immediately, transcription
would produce an mRNA that is able to adopt the strong
hairpin structure, and transcription termination would
occur. By this mechanism, differential production of RlpB and
δ could be effected.

A further level of regulation might occur at the level of
codon usage. The frequency of rare codons in holA, 8.7%, is
2.5-fold higher than that in nonregulatory genes and 5-fold
higher than that in highly expressed ribosome subunit genes.
Thus, a bias toward rare-codon usage might contribute to the
low level of δ production. As a component of holoenzyme, δ
is predicted to have a copy number of 10 to 20 molecules per
cell. However, caution should be used in attributing a major
effect on gene expression to rare-codon usage. Although
codon usage bias probably has some effect, it is most likely
of secondary importance to features such as the strengths of
promoter or Shine-Dalgarno sites (7) or possible regulation
of the rlpB promoter.

As mentioned, HpB^ the gene encoding a rare lipoprotein, 
and hoL4 may exist in one operon. It is also possible that one 
or more of the remaining genes in the leuS-dacA region of 
the diromosome are in this putative operon. By being in the 
same operon, holA and HpB would be subject to, for 
example, a modulator of transcription originating from the 
HpB promoter. The possibBity that holA and HpB arc coreg- 
ulatcd might be a clue to the more general possil)ility that the 
two macromolecular synthesis systems, chromosomal repli- 
cation and cell envelope biogenesis, arc coordinatcly regu- 
lated; the putative operon coniaintog HpB and holA would 
contain genes involved in at least these two distinct macro- 
molecular biosynthetic processes. A link between repHca- 
tion and cell envelope biogenesis proteins has been reported 
by Sakka ct al. (55), who showed that the catalytic a subunh 
of holoenzyme negatively regulates expression of signal 
peptidase 11, the enzyme that cleaves the leader peptide from 
prolipoproteins, a requisite step for lipoprotein insertion into 
the membrane. Thus, one subunit of hotoenzymc, a, is 
apparently involved in the regulation of the signal peptidase 
II that processes Che product of the gene upstream of (and 
possibly coregutated with) holA to a biologically active form. 
Whether this leads to a relevant genetic circuit remains 
unknown. Although grouping of genes whose products are 
functionally dissimilar is not common, at least two major 
examples exist in E. coli. The first, the macromolecular 
synthesis operon, encodes the dnaG (primase), rpoD (σ70 
subunit of RNA polymerase), and rpsU (S21 ribosome 
subunit) genes (5, 31), and the second, the macromolecular 
synthesis II operon, encodes dnaE (α polymerase subunit of 
holoenzyme), cdsA (CDP-diglyceride synthetase), lpxA 
(UDP-N-acetylglucosamine acetyltransferase), and lpxB 
(lipid A disaccharide synthase) (63a). 

The δ subunit, as part of the γδ complex, is required to 
load the β subunit onto a primed template in a reaction that 
requires ATP hydrolysis. In a study to determine which 
components of the γδ complex (γ, δ, δ', χ, and ψ) hydrolyze 
ATP, Onrust et al. (49) found that neither γ nor δ alone is a 
DNA-dependent ATPase but that the two subunits combined 
have significant ATPase activity. Given that γ binds ATP 
(66) and that an alternate product of the gene encoding γ 
(dnaX [3, 11, 67]) is a DNA-dependent ATPase (30), the 
simplest interpretation is that δ stimulates the intrinsic 
ATPase activity of γ but is not itself an ATPase. 

Whereas our analysis of the primary amino acid sequence 
of δ did reveal a possible ATP-binding site based on similarity 
to the Walker A-consensus ATP-binding site, several 
observations argue against the presence of a biologically 
active ATP-binding site in δ. First, the putative ATP-binding 
site of δ has 3 residues between the conserved Ala-1 and 
Gly-6, instead of the 4 residues found in the consensus motif. 
We were unable to find a single example of an ATP-binding 
protein with this deviation from the consensus. Second, the 
ATP-binding sites of previously characterized ATP-binding 
proteins almost always have a glycine residue at position 4 of 
the consensus. Thirty-five of the 37 ATP-binding proteins 
presented in references 16, 17, 45, and 69 have this glycine 
residue; the two exceptions, RecA and DnaB, have a serine. 
The ATP-binding site of δ contains neither a glycine nor 
serine residue at this position. Finally, the consensus ATP- 
binding site must form a loop between an α-helix and a 
β-sheet. Analysis of the secondary structure of δ by two 
different methods predicts that any ATP-binding site of δ is 
buried within a long α-helix. For these reasons, it is unlikely 
that δ contains an ATP-binding site. Studies to address this 
issue experimentally await purification of δ to homogeneity. 
Work is currently underway to purify δ from our overproducing 
strain and to examine the contribution of this subunit 
to the functional specialization of the two halves of the 
asymmetric, dimeric DNA polymerase III holoenzyme. 
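
The first two sequence-based criteria used above against an ATP-binding site in δ (spacer length and the residue at position 4) can be sketched as a simple pattern search. The sketch assumes the Walker A consensus written as [AG]-x(4)-G-K-[ST]; the helper names are hypothetical, and the two peptide strings are made-up toy sequences, not the δ sequence or any real protein.

```python
import re

# Walker A consensus as a regular expression: [AG]-x(4)-G-K-[ST].
CONSENSUS = r"[AG].{4}GK[ST]"
# Deviant form with only 3 residues between position 1 and the conserved Gly.
SHORT_SPACER = r"[AG].{3}GK[ST]"


def find_sites(seq, pattern):
    """Return (1-based start, matched peptide) for every match of
    `pattern` in `seq`, including overlapping matches."""
    lookahead = re.compile("(?=(" + pattern + "))")
    return [(m.start() + 1, m.group(1)) for m in lookahead.finditer(seq.upper())]


def residue_at_position_4(site):
    """Residue at position 4 of a matched site (Gly in most known sites)."""
    return site[3]


# Toy sequences invented for this example.
toy_canonical = "MKTAGPSGSGKTTLLR"  # contains GPSGSGKT (4-residue spacer)
toy_deviant = "MLDAVTRGKSWE"        # contains AVTRGKS (3-residue spacer)

canonical_hits = find_sites(toy_canonical, CONSENSUS)
deviant_hits = find_sites(toy_deviant, SHORT_SPACER)
```

A search of this kind addresses only the primary sequence. The third criterion, that the site must lie in a loop between an α-helix and a β-sheet, requires a separate secondary-structure prediction and is not covered by the sketch.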